AITopics | monte carlo method

Collaborating Authors

monte carlo method

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Variance Reduction in Stochastic Gradient Langevin Dynamics

Kumar Avinava Dubey, Sashank J. Reddi, Sinead A. Williamson, Barnabas Poczos, Alexander J. Smola, Eric P. Xing

Neural Information Processing SystemsMay-1-2026, 05:56:42 GMT

Neural Information Processing Systems http://nips.cc/

artificial intelligence, gradient, machine learning, (15 more...)

Neural Information Processing Systems

Country: North America > United States > Texas (0.28)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.69)

Add feedback

Variance Reduction in Stochastic Gradient Langevin Dynamics

Neural Information Processing SystemsMar-17-2026, 10:02:15 GMT

Stochastic gradient-based Monte Carlo methods such as stochastic gradient Langevin dynamics are useful tools for posterior inference on large scale datasets in many machine learning applications. These methods scale to large datasets by using noisy gradients calculated using a mini-batch or subset of the dataset. However, the high variance inherent in these noisy gradients degrades performance and leads to slower mixing. In this paper, we present techniques for reducing variance in stochastic gradient Langevin dynamics, yielding novel stochastic Monte Carlo methods that improve performance by reducing the variance in the stochastic gradient. We show that our proposed method has better theoretical guarantees on convergence rate than stochastic Langevin dynamics. This is complemented by impressive empirical results obtained on a variety of real world datasets, and on four different machine learning tasks (regression, classification, independent component analysis and mixture modeling). These theoretical and empirical contributions combine to make a compelling case for using variance reduction in stochastic Monte Carlo methods.

artificial intelligence, machine learning, proceedings, (8 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Efficient Variance-reduced Estimation from Generative EHR Models: The SCOPE and REACH Estimators

Solo, Luke, McDermott, Matthew B. A., Parker, William F., Ramadan, Bashar, Burkhart, Michael C., Beaulieu-Jones, Brett K.

arXiv.org Machine LearningFeb-4-2026

Generative models trained using self-supervision of tokenized electronic health record (EHR) timelines show promise for clinical outcome prediction. This is typically done using Monte Carlo simulation for future patient trajectories. However, existing approaches suffer from three key limitations: sparse estimate distributions that poorly differentiate patient risk levels, extreme computational costs, and high sampling variance. We propose two new estimators: the Sum of Conditional Outcome Probability Estimator (SCOPE) and Risk Estimation from Anticipated Conditional Hazards (REACH), that leverage next-token probability distributions discarded by standard Monte Carlo. We prove both estimators are unbiased and that REACH guarantees variance reduction over Monte Carlo sampling for any model and outcome. Empirically, on hospital mortality prediction in MIMIC-IV using the ETHOS-ARES framework, SCOPE and REACH match 100-sample Monte Carlo performance using only 10-11 samples (95% CI: [9,11]), representing a ~10x reduction in inference cost without degrading calibration. For ICU admission prediction, efficiency gains are more modest (~1.2x), which we attribute to the outcome's lower "spontaneity," a property we characterize theoretically and empirically. These methods substantially improve the feasibility of deploying generative EHR models in resource-constrained clinical settings.

large language model, machine learning, natural language, (19 more...)

arXiv.org Machine Learning

2602.0373

Country:

North America > United States > New York > New York County > New York City (0.14)
North America > United States > Illinois > Cook County > Chicago (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)

Genre: Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Health & Medicine > Health Care Technology > Medical Record (0.87)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.46)

Add feedback

Variance Reduction in Stochastic Gradient Langevin Dynamics

Neural Information Processing SystemsNov-21-2025, 15:06:44 GMT

name change, stochastic gradient langevin dynamic, variance reduction, (4 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Particle Monte Carlo methods for Lattice Field Theory

Yallup, David

arXiv.org Machine LearningNov-20-2025

High-dimensional multimodal sampling problems from lattice field theory (LFT) have become important benchmarks for machine learning assisted sampling methods. We show that GPU-accelerated particle methods, Sequential Monte Carlo (SMC) and nested sampling, provide a strong classical baseline that matches or outperforms state-of-the-art neural samplers in sample quality and wall-clock time on standard scalar field theory benchmarks, while also estimating the partition function. Using only a single data-driven covariance for tuning, these methods achieve competitive performance without problem-specific structure, raising the bar for when learned proposals justify their training cost.

artificial intelligence, machine learning, sampler, (17 more...)

arXiv.org Machine Learning

2511.15196

Country: Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)

Genre: Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Add feedback

On the flow matching interpretability

Pivi, Francesco, Gazza, Simone, Evangelista, Davide, Amadini, Roberto, Gabbrielli, Maurizio

arXiv.org Artificial IntelligenceOct-27-2025

Generative models based on flow matching have demonstrated remarkable success in various domains, yet they suffer from a fundamental limitation: the lack of interpretability in their intermediate generation steps. In fact these models learn to transform noise into data through a series of vector field updates, however the meaning of each step remains opaque. We address this problem by proposing a general framework constraining each flow step to be sampled from a known physical distribution. Flow trajectories are mapped to (and constrained to traverse) the equilibrium states of the simulated physical process. We implement this approach through the 2D Ising model in such a way that flow steps become thermal equilibrium points along a parametric cooling schedule. Our proposed architecture includes an encoder that maps discrete Ising configurations into a continuous latent space, a flow-matching network that performs temperature-driven diffusion, and a projector that returns to discrete Ising states while preserving physical constraints. We validate this framework across multiple lattice sizes, showing that it preserves physical fidelity while outperforming Monte Carlo generation in speed as the lattice size increases. In contrast with standard flow matching, each vector field represents a meaningful stepwise transition in the 2D Ising model's latent space. This demonstrates that embedding physical semantics into generative flows transforms opaque neural trajectories into interpretable physical processes.

artificial intelligence, interpretability, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2510.2121

Country: Europe > Italy (0.15)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Add feedback

272e11700558e27be60f7489d2d782e7-Paper.pdf

Neural Information Processing SystemsSep-24-2025, 16:43:30 GMT

artificial intelligence, iteration, machine learning, (14 more...)

Neural Information Processing Systems

Country: North America > United States > Wisconsin > Dane County > Madison (0.14)

Industry:

Energy > Oil & Gas > Upstream (0.93)
Health & Medicine (0.69)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.97)

Add feedback

Repulsive Monte Carlo on the sphere for the sliced Wasserstein distance

Petrovic, Vladimir, Bardenet, Rémi, Desolneux, Agnès

arXiv.org Machine LearningSep-15-2025

In this paper, we consider the problem of computing the integral of a function on the unit sphere, in any dimension, using Monte Carlo methods. Although the methods we present are general, our guiding thread is the sliced Wasserstein distance between two measures on $\mathbb{R}^d$, which is precisely an integral on the $d$-dimensional sphere. The sliced Wasserstein distance (SW) has gained momentum in machine learning either as a proxy to the less computationally tractable Wasserstein distance, or as a distance in its own right, due in particular to its built-in alleviation of the curse of dimensionality. There has been recent numerical benchmarks of quadratures for the sliced Wasserstein, and our viewpoint differs in that we concentrate on quadratures where the nodes are repulsive, i.e. negatively dependent. Indeed, negative dependence can bring variance reduction when the quadrature is adapted to the integration task. Our first contribution is to extract and motivate quadratures from the recent literature on determinantal point processes (DPPs) and repelled point processes, as well as repulsive quadratures from the literature specific to the sliced Wasserstein distance. We then numerically benchmark these quadratures. Moreover, we analyze the variance of the UnifOrtho estimator, an orthogonal Monte Carlo estimator. Our analysis sheds light on UnifOrtho's success for the estimation of the sliced Wasserstein in large dimensions, as well as counterexamples from the literature. Our final recommendation for the computation of the sliced Wasserstein distance is to use randomized quasi-Monte Carlo in low dimensions and \emph{UnifOrtho} in large dimensions. DPP-based quadratures only shine when quasi-Monte Carlo also does, while repelled quadratures show moderate variance reduction in general, but more theoretical effort is needed to make them robust.

estimator, integrand, wasserstein distance, (14 more...)

arXiv.org Machine Learning

2509.10166

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > France (0.04)
North America > United States (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Variance reduction for Random Coordinate Descent-Langevin Monte Carlo

Neural Information Processing SystemsAug-13-2025, 23:09:02 GMT

There is vast literature on sampling, and proposed methods fall into a few different categories.

artificial intelligence, iteration, machine learning, (14 more...)

Neural Information Processing Systems

Country: North America > United States > Wisconsin (0.28)

Industry:

Energy > Oil & Gas > Upstream (0.93)
Health & Medicine (0.69)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.97)

Add feedback

Quasi-Random Physics-informed Neural Networks

Yu, Tianchi, Oseledets, Ivan

arXiv.org Artificial IntelligenceJul-14-2025

Physics-informed neural networks have shown promise in solving partial differential equations (PDEs) by integrating physical constraints into neural network training, but their performance is sensitive to the sampling of points. Based on the impressive performance of quasi Monte-Carlo methods in high dimensional problems, this paper proposes Quasi-Random Physics-Informed Neural Networks (QRPINNs), which use low-discrepancy sequences for sampling instead of random points directly from the domain. Theoretically, QRPINNs have been proven to have a better convergence rate than PINNs. Empirically, experiments demonstrate that QRPINNs significantly outperform PINNs and some representative adaptive sampling methods, especially in high-dimensional PDEs. Furthermore, combining QRPINNs with adaptive sampling can further improve the performance.

artificial intelligence, machine learning, sequence, (17 more...)

arXiv.org Artificial Intelligence

2507.08121

Country:

Asia (0.28)
Europe (0.28)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback